

Search for: All records

Creators/Authors contains: "Kim, Michael"


  1. Kaestner Pack (Ed.)
    BACKGROUND & AIMS: Lacticaseibacillus rhamnosus GG (LGG) is the world’s most consumed probiotic species, but its mechanism of action on intestinal permeability and differentiation, as well as its interactions with dietary tryptophan, an essential source of signaling metabolites, are incompletely studied. METHODS: Untargeted metabolomic and transcriptomic analyses were performed on LGG-mono-colonized germ-free (GF) mice fed tryptophan (trp)-free or trp-sufficient diets. LGG-derived metabolites were profiled in vitro under anaerobic and aerobic conditions. Multiomic correlations were performed using a newly developed metabolome-transcriptome correlating bioinformatic algorithm. Newly uncovered gut barrier-modulating metabolites whose abundances are regulated by LGG and dietary trp were functionally tested in Trans-Epithelial Electrical Resistance (TEER) assays, mouse enteroids, and dextran sulfate sodium (DSS) experimental colitis. The contribution of the trp-methylnicotinamide (MNA) pathway to barrier protection was delineated at specific tight junction (TJ) proteins and enterocyte-promoting factors with gain- and loss-of-function approaches. RESULTS: LGG, strictly in the presence of dietary trp, promotes the enterocyte program and the expression of multiple TJ genes, particularly Ocln. Fecal and serum metabolites that are synergistically stimulated by LGG and dietary trp are identified. Functional evaluations revealed a novel LGG-stimulated, trp-dependent Vitamin B3 metabolism pathway, with MNA unexpectedly being the most robust barrier-protective metabolite in vitro and in vivo. Reduced serum MNA is significantly associated with increased disease activity in IBD patients. Exogenous MNA enhances the gut barrier in homeostasis and robustly promotes colonic healing in DSS colitis. MNA is sufficient to promote intestinal epithelial Ocln and RNF43, a master inhibitor of the Wnt pathway. Blocking trp or Vitamin B3 absorption abolishes barrier recovery in vivo. CONCLUSIONS: Our study uncovers a novel LGG-regulated, dietary trp-dependent production of MNA that protects the gut barrier against colitis.
  2. The gold-standard approaches for gleaning statistically valid conclusions from data involve random sampling from the population. Collecting properly randomized data, however, can be challenging, so modern statistical methods, including propensity score reweighting, aim to enable valid inferences when random sampling is not feasible. We put forth an approach for making inferences based on available data from a source population that may differ in composition in unknown ways from an eventual target population. Whereas propensity scoring requires a separate estimation procedure for each different target population, we show how to build a single estimator, based on source data alone, that allows for efficient and accurate estimates on any downstream target data. We demonstrate, theoretically and empirically, that our target-independent approach to inference, which we dub “universal adaptability,” is competitive with target-specific approaches that rely on propensity scoring. Our approach builds on a surprising connection between the problem of inferences in unspecified target populations and the multicalibration problem, studied in the burgeoning field of algorithmic fairness. We show how the multicalibration framework can be employed to yield valid inferences from a single source population across a diverse set of target populations. 
  3. Shapley value is a classic notion from game theory, historically used to quantify the contributions of individuals within groups, and more recently applied to assign values to data points when training machine learning models. Despite its foundational role, a key limitation of the data Shapley framework is that it only provides valuations for points within a fixed data set. It does not account for statistical aspects of the data and does not give a way to reason about points outside the data set. To address these limitations, we propose a novel framework -- distributional Shapley -- where the value of a point is defined in the context of an underlying data distribution. We prove that distributional Shapley has several desirable statistical properties; for example, the values are stable under perturbations to the data points themselves and to the underlying data distribution. We leverage these properties to develop a new algorithm for estimating values from data, which comes with formal guarantees and runs two orders of magnitude faster than state-of-the-art algorithms for computing the (non-distributional) data Shapley values. We apply distributional Shapley to diverse data sets and demonstrate its utility in a data market setting. 
  4. Many selection procedures involve ordering candidates according to their qualifications. For example, a university might order applicants according to a perceived probability of graduation within four years, and then select the top 1000 applicants. In this work, we address the problem of ranking members of a population according to their “probability” of success, based on a training set of historical binary outcome data (e.g., graduated in four years or not). We show how to obtain rankings that satisfy a number of desirable accuracy and fairness criteria, despite the coarseness of the training data. As the task of ranking is global (the rank of every individual depends not only on their own qualifications, but also on every other individual’s qualifications), ranking is more subtle and vulnerable to manipulation than standard prediction tasks. Towards mitigating unfair discrimination caused by inaccuracies in rankings, we develop two parallel definitions of evidence-based rankings. The first definition relies on a semantic notion of domination-compatibility: if the training data suggest that members of a set S are more qualified (on average) than the members of T, then a ranking that favors T over S (where T dominates S) is blatantly inconsistent with the evidence, and likely to be discriminatory. The definition asks for domination-compatibility, not just for a pair of sets, but rather for every pair of sets from a rich collection C of subpopulations. The second definition aims at precluding even more general forms of discrimination; this notion of evidence-consistency requires that the ranking must be justified on the basis of consistency with the expectations for every set in the collection C. Somewhat surprisingly, while evidence-consistency is a strictly stronger notion than domination-compatibility when the collection C is predefined, the two notions are equivalent when the collection C may depend on the ranking in question.
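As a rough illustration of the classic Shapley value mentioned in record 3 above, the sketch below computes exact values for a tiny fixed data set by averaging each point's marginal contribution over all orderings. It shows only the standard, non-distributional notion, not the distributional Shapley algorithm from that abstract. The names `shapley_values` and `toy_utility`, the three "data points", and the utility scores are all hypothetical and chosen purely for illustration.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley values: average marginal contribution over all orderings.

    `utility` maps a frozenset of players to a number. The cost grows
    factorially with the number of players, so this only suits tiny examples.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        previous = utility(coalition)
        for p in order:
            coalition = coalition | {p}
            current = utility(coalition)
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def toy_utility(subset):
    """Hypothetical stand-in for 'model quality when trained on this subset'."""
    score = 0.0
    if "a" in subset:
        score += 1.0  # point "a" helps on its own
    if "b" in subset and "c" in subset:
        score += 3.0  # points "b" and "c" only help together
    return score


values = shapley_values(["a", "b", "c"], toy_utility)
print(values)
```

In this toy setting the values sum to the utility of the full data set, which is the efficiency property of the Shapley value. The abstract's point is that such values are only defined for points inside the fixed set, which is the limitation the distributional framework is meant to address.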